    "i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter

    Social media users often make explicit predictions about upcoming events. Such statements vary in the degree of certainty the author expresses toward the outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win" or "No way Leonardo wins!". Can popular beliefs on social media predict who will win? To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision. We then forecast uncertain outcomes using the wisdom of crowds, by aggregating users' explicit predictions. Our method for forecasting winners is fully automated, relying only on a set of contenders as input. It requires no training data of past outcomes and outperforms sentiment and tweet volume baselines on a broad range of contest prediction tasks. We further demonstrate how our approach can be used to measure the reliability of individual accounts' predictions and retrospectively identify surprise outcomes. Comment: Accepted at EMNLP 2017 (long paper)

    Investigating Reasons for Disagreement in Natural Language Inference

    We investigate how disagreement in natural language inference (NLI) annotation arises. We developed a taxonomy of disagreement sources with 10 categories spanning 3 high-level classes. We found that some disagreements are due to uncertainty in the sentence meaning, others to annotator biases and task artifacts, leading to different interpretations of the label distribution. We explore two modeling approaches for detecting items with potential disagreement: a 4-way classification with a "Complicated" label in addition to the three standard NLI labels, and a multilabel classification approach. We found that the multilabel classification is more expressive and gives better recall of the possible interpretations in the data. Comment: accepted at TACL, pre-MIT Press publication version

    Challenges and solutions for Latin named entity recognition

    Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English. Data sparsity in Latin presents a number of challenges for traditional Named Entity Recognition techniques. Solving such challenges and enabling reliable Named Entity Recognition in Latin texts can facilitate many downstream applications, from machine translation to digital historiography, enabling Classicists, historians, and archaeologists, for instance, to track the relationships of historical persons, places, and groups on a large scale. This paper presents the first annotated corpus for evaluating Named Entity Recognition in Latin, as well as a fully supervised model that achieves over 90% F-score on a held-out test set, significantly outperforming a competitive baseline. We also present a novel active learning strategy that predicts how many and which sentences need to be annotated for named entities in order to attain a specified degree of accuracy when recognizing named entities automatically in a given text. This maximizes the productivity of annotators while simultaneously controlling quality.

    The prosody of presupposition projection in naturally-occurring utterances

    In experimental studies, prosodically-marked pragmatic focus has been found to influence the projection of factive presuppositions of utterances like "these parents didn’t know the kid was gone" (Cummins and Rohde, 2015; Tonhauser, 2016; Djärv and Bacovcin, 2017), supporting question-based analyses of projection (i.a., Abrusán, 2011; Abrusán, 2016; Simons et al., 2017; Beaver et al., 2017). However, no prior work has explored whether this effect extends to naturally-occurring utterances. In a large set of naturally-occurring utterances, we find that prosodically-marked focus influences projection in utterances with factive embedding predicates, but not those with non-factive predicates. We argue that our findings support an account where lexical semantics of the predicate contributes to projection to the extent that they admit QUD alternatives that can be assumed to entail the content of the complement.

    Deviations from plastic barriers in Bi$_2$Sr$_2$CaCu$_2$O$_{8+\delta}$ thin films

    Resistive transitions of an epitaxial Bi$_2$Sr$_2$CaCu$_2$O$_{8+\delta}$ thin film were measured in various magnetic fields ($H \parallel c$), ranging from 0 to 22.0 T. Rounded curvatures of the low-resistivity tails are observed in the Arrhenius plot and are considered to relate to deviations from plastic barriers. In order to characterize these deviations, an empirical barrier form is developed, which is found to be in good agreement with experimental data and to coincide with the plastic barrier form in a limited magnetic field range. Using the plastic barrier predictions and the empirical barrier form, we successfully explain the observed deviations. Comment: 5 pages, 6 figures; PRB 71, 052502 (2005)

    Ecologically Valid Explanations for Label Variation in NLI

    Human label variation, or annotation disagreement, exists in many natural language processing (NLP) tasks, including natural language inference (NLI). To gain direct evidence of how NLI label variation arises, we build LiveNLI, an English dataset of 1,415 ecologically valid explanations (annotators explain the NLI labels they chose) for 122 MNLI items (at least 10 explanations per item). The LiveNLI explanations confirm that people can systematically vary on their interpretation and highlight within-label variation: annotators sometimes choose the same label for different reasons. This suggests that explanations are crucial for navigating label interpretations in general. We few-shot prompt large language models to generate explanations but the results are inconsistent: they sometimes produce valid and informative explanations, but they also generate implausible ones that do not support the label, highlighting directions for improvement. Comment: Findings at EMNLP 2023. Overlap with previous version arXiv:2304.1244

    Synthesizing Java expressions from free-form queries


    Over ‘sexed’ regulation and the disregarded worker: an overview of the impact of sexual entertainment policy on lap-dancing club workers

    In England and Wales, with the introduction of Section 27 of the Policing and Crime Act 2009, lap-dancing clubs can now be licensed as Sexual Entertainment Venues. This article considers this change, offering a critique of Section 27 and arguing that this legislation is not evidence-based, with lap-dancing policy, like other sex-work policies, often associated with crime, deviance and immorality. Furthermore, it is argued that sex-work policies are gradually being homogenised as well as increasingly criminalised. Other criticisms relate to various licensing loopholes which lead to some striptease venues remaining unlicensed and unregulated, potentially impacting on the welfare of erotic dancers. In addition, restrictions on the numbers of lap-dancing venues may exacerbate dancer unemployment, drawing these women into poverty. Finally, the Policing and Crime Act reflects how the political focus is being directed away from the exploitation of workers, on to issues relating to crime and deviance, despite limited evidence to support this focus.